17 research outputs found

    The ESCAPE project: Energy-efficient Scalable Algorithms for Weather Prediction at Exascale

    Abstract. In the simulation of complex multi-scale flows arising in weather and climate modelling, one of the biggest challenges is to satisfy strict service requirements in terms of time to solution and to satisfy budgetary constraints in terms of energy to solution, without compromising the accuracy and stability of the application. These simulations require algorithms that minimise the energy footprint along with the time required to produce a solution, maintain the physically required level of accuracy, are numerically stable, and are resilient in case of hardware failure. The European Centre for Medium-Range Weather Forecasts (ECMWF) led the ESCAPE (Energy-efficient Scalable Algorithms for Weather Prediction at Exascale) project, funded by Horizon 2020 (H2020) under the FET-HPC (Future and Emerging Technologies in High Performance Computing) initiative. The goal of ESCAPE was to develop a sustainable strategy to evolve weather and climate prediction models to next-generation computing technologies. The project partners incorporate the expertise of leading European regional forecasting consortia, university research, experienced high-performance computing centres, and hardware vendors. This paper presents an overview of the ESCAPE strategy: (i) identify domain-specific key algorithmic motifs in weather prediction and climate models (which we term Weather & Climate Dwarfs), (ii) categorise them in terms of computational and communication patterns while (iii) adapting them to different hardware architectures with alternative programming models, (iv) analyse the challenges in optimising, and (v) find alternative algorithms for the same scheme. 
The participating weather prediction models are the following: IFS (Integrated Forecasting System); ALARO, a combination of AROME (Application de la Recherche à l'Opérationnel à Meso-Echelle) and ALADIN (Aire Limitée Adaptation Dynamique Développement International); and COSMO–EULAG, a combination of COSMO (Consortium for Small-scale Modeling) and EULAG (Eulerian and semi-Lagrangian fluid solver). For many of the weather and climate dwarfs ESCAPE provides prototype implementations on different hardware architectures (mainly Intel Skylake CPUs, NVIDIA GPUs, Intel Xeon Phi, Optalysys optical processor) with different programming models. The spectral transform dwarf represents a detailed example of the co-design cycle of an ESCAPE dwarf. The dwarf concept has proven to be extremely useful for the rapid prototyping of alternative algorithms and their interaction with hardware; e.g. the use of a domain-specific language (DSL). Manual adaptations have led to substantial accelerations of key algorithms in numerical weather prediction (NWP) but are not a general recipe for the performance portability of complex NWP models. Existing DSLs are found to require further evolution but are promising tools for achieving the latter. Measurements of energy and time to solution suggest that a future focus needs to be on exploiting the simultaneous use of all available resources in hybrid CPU–GPU arrangements.

    Additional file 1: Figure S1. of SeqPurge: highly-sensitive adapter trimming for paired-end NGS data

    Mapping of reads without insert. Figure S2. Variant that is suppressed by adapter contamination. Figure S3. High-quality reads removed by SeqPrep (example 1). Figure S4. High-quality reads removed by SeqPrep (example 2). Table S1. Benchmark results with low-quality trimming. Table S2. Detailed benchmark results on simulated data. (DOCX 184 kb)

    Progesterone 5β-reductases/iridoid synthases (PRISE): gatekeeper role of highly conserved phenylalanines in substrate preference and trapping is supported by molecular dynamics simulations

    Vein Patterning 1 (VEP1)-encoded progesterone 5β-reductases/iridoid synthases (PRISE) belong to the short-chain dehydrogenase/reductase superfamily of proteins. They are characterized by a set of highly conserved amino acids in the substrate-binding pocket. All PRISEs are capable of reducing the activated C=C double bond of various enones enantioselectively and therefore have potential as biocatalysts in bioorganic synthesis. Here, recombinant forms of PRISEs of Arabidopsis thaliana and Digitalis lanata were modified using site-directed mutagenesis (SDM). In rDlP5βR, a set of highly conserved amino acids in the vicinity of the catalytic center was individually substituted for alanine, resulting in considerable to complete loss of enone reductase activity. F153 and F343, which can be found in most PRISEs known, are located at the outer rim of the catalytic cavity and seem to be involved in substrate binding; their role was addressed in a series of SDM experiments. The wild-type PRISE accepted progesterone (large hydrophobic 1,4-enone) as well as 2-cyclohexen-1-one (small hydrophilic 1,4-enone), whereas the double mutant rAtP5βR_F153A_F343A converted progesterone much better than the wild-type enzyme but almost lost its capability of reducing 2-cyclohexen-1-one. Recombinant Draba aizoides P5βR (rDaP5βR) has a second pair of phenylalanines at positions 156 and 345 at the rim of the binding site. These two phenylalanines were introduced into rAtP5βR_F153A_F343A, and the resulting quadruple mutant rAtP5βR_F153A_F343A_V156F_V345F partly recovered the ability to reduce 2-cyclohexen-1-one. These results can best be explained by assuming a trapping mechanism in which phenylalanines at the rim of the substrate-binding pocket are involved. 
The dynamic behavior of individual P5βRs and mutants thereof was investigated by molecular dynamics simulations, and all calculations supported the ‘gatekeeper’ role of phenylalanines at the periphery of the substrate-binding pocket. Our findings provide structural and mechanistic explanations for the different substrate preferences seen among the natural PRISEs and help to explain the large differences in catalytic efficiency found for different types of 1,4-enones.

    Next generation sequencing of the clonal IGH rearrangement detects ongoing mutations and interfollicular trafficking in in situ follicular neoplasia

    Follicular lymphoma (FL) is characterized genetically by a significant intraclonal diversity of rearranged immunoglobulin heavy chain (IGH) genes and a substantial cell migration activity (follicular trafficking). Recently, in situ follicular neoplasia (ISFN), characterized by accumulations of immunohistochemically strongly BCL2-positive, t(14;18)+ clonal B cells confined to germinal centers in reactive lymph nodes, has been identified as a precursor lesion of FL with low risk of progression to manifest FL. The extent of ongoing somatic hypermutation of rearranged IGH genes and interfollicular trafficking in ISFN is not known. In this study we performed an in-depth analysis of clonal evolution and cell migration patterns in a case of pure ISFN involving multiple lymph nodes. Using laser microdissection and next generation sequencing (NGS), we documented significant intraclonal diversity of the rearranged IGH gene and extensive interfollicular migration between germinal centers of the same lymph node as well as between different lymph nodes. Furthermore, we identified N-glycosylation motifs characteristic of FL in the CDR3 region.

    Diagnosing hereditary ataxias in a cohort of consanguine patients using a Next-Generation-Sequencing panel

    Background: Hereditary ataxias pose a considerable challenge when a molecular diagnosis is sought. While more than 100 genes are involved in Mendelian diseases with ataxia, only a small proportion of these genes have been systematically tested in cohorts of patients with a consanguineous family history. With the advent of next-generation sequencing (NGS), a massive sequencing approach can be implemented with relative ease. We investigated the occurrence of disease-causing variants by sequencing a cohort of closely related patients recruited for the EUROSCA and NEUROMICS EU projects. The families originated mainly from the Mediterranean area. Each patient was strictly selected to avoid sequencing persons suffering from non-hereditary kinds of ataxia or ataxia due to triplet repeat enrichment. Methods: We have established a selector-based enrichment method (HaloPlex, Agilent) specifically targeting 140 known ataxia genes as well as genes causal for rare diseases with a phenotypic overlap with ataxia. The panel covers most known genes causal for pure ataxia, mitochondrial ataxia and metabolic ataxia. A total of 582 kb of genomic DNA is specifically enriched and sequenced on an Illumina MiSeq (2x 150 bp paired-end). Data analysis is performed using an in-house bioinformatics pipeline based on ANNOVAR. Results: Although massively parallel sequencing usually yields several hundred variants (Ø 384 ± SD 16), filtering for rare variants (in our own NGS database and in 1000g, ESP6500) and for functional relevance (ns, ss, indel) reduced this count to Ø 20 ± SD 4. A statistical evaluation of the panel's performance shows superior coverage (Ø > 96 % coverage at 20X ± SD 1.8) and target enrichment values (Ø 178 ± SD 48 mapping depth on target). 
Several disease-causing mutations could be identified in genes such as APTX, FGF14, NPC1, PLEKHG4, SACS, SETX, SIL1, SPTBN2, SYNE1 and many others. Conclusion: A panel sequencing approach offers a cheap and fast way to screen large patient cohorts for rare disease-causing variants. Focusing on patients with a consanguineous family background allows the discovery of rare and new variants for ataxia at a relatively high frequency.

    Phylogenetic tree of the eleven sequence groups.

    Spatial distance between the groups corresponds to the number of diverging base pairs. The dotted line indicates a more distant relation to the three single sequences, which could not be assigned to the eleven groups.

    IGH clonality analysis of analysed follicles.

    A) Electropherograms of VDJ rearrangement clonality analysis using the framework 2 (FR2) primer set show a clonal product of 268 base pairs in size. DNA from complete lymph node section. B) Electropherograms of six microdissected samples containing single or pooled ISFN lesions which showed products by fragment-analysis-based PCR amplification. Green arrows indicate the clonal 268 base pair products.

    Identification of ISFN-associated reads and read groups.

    A) Table of the samples according to their total reads, percentage of specific and nonspecific reads and their groups. B) Numbers of specific reads which were identified in the different samples. Productive rearrangements are depicted as coloured bars indicating the assignment to the cluster groups; grey bars indicate unproductive rearrangements. In five samples no specific reads were identified. C) All specific reads clustered into 11 groups of identical CDR3 amino acid sequence based on sequence similarities. Columns indicate the number of respective sequences which are assigned to a group.

    Discover putative disease-causing mutations in patients with ataxia or paraplegia by using Next-Generation-Sequencing panels

    Within the last decade, knowledge of the genetics of neurodegenerative diseases such as the highly heterogeneous hereditary ataxias and paraplegias has expanded greatly. To date, more than 100 genes have been identified as being involved in Mendelian causes of ataxia and paraplegia. Despite this knowledge, only a subset of these genes has been tested systematically for putative disease-causing mutations in patient cohorts. Here we investigated two NeurOmics cohorts for the occurrence of disease-causing variants. To this end, we investigated samples from pre-screened patients from families with at least two affected members. We established two separate ataxia- and paraplegia-specific selector-probe-based enrichment assays (HaloPlex, Agilent). The ataxia probe set contains 140 genes (classical, mitochondrial and metabolic ataxia), including genes from rare diseases showing a partial ataxia phenotype. The paraplegia panel contains 98 genes. A total target region of 582 kbp (ataxia) / 267 kbp (paraplegia) was specifically enriched and sequenced on an Illumina MiSeq next generation sequencing platform with 2x 150 bp paired-end runs. Sequenced reads were analysed by an in-house bioinformatics pipeline. Currently, 31 NeurOmics patients (22 with ataxia, 9 with paraplegia) have been analysed. We achieved a high efficiency with the HaloPlex platform. On average, >95% (ataxia group) / >97% (paraplegia group) of the target region was covered by >20 reads, with a mean base coverage of 176 ± 16 (ataxia) and 285 ± 131 (paraplegia). Mapping the reads to the human genome (hg19) followed by annotation against different databases (ANNOVAR) resulted in a large number of variants (350 – 416 for ataxia / 158 – 177 for paraplegia). 
These lists of variants were further filtered for rare variants (in-house NGS database, 1000g, esp6500) and for functional relevance (ns, ss, indel), which reduced the number to 14 – 16 (ataxia) and 8 – 13 (paraplegia) variants per patient. Several rare, potentially disease-causing mutations were found in different genes for diseases with ataxia (CACNA1A, FGF14, SPG7) and paraplegia (SPG11, KIF1C). The HaloPlex target enrichment method followed by Illumina MiSeq sequencing has shown relevant technical advantages compared with other approaches (solid hybridization or exome sequencing). This workflow is more sensitive, relatively fast and cost-effective, especially for diagnostic questions. It is useful for screening large patient cohorts with an unknown genetic cause. Within the NeurOmics project, for patients in whom no disease-causing mutation could be detected by panel sequencing, whole exome sequencing might offer an opportunity to find a putative causal mutation.

    Highly Sensitive Detection of Surface and Intercalated Impurities in Graphene by LEIS

    Low-energy ion scattering (LEIS) is known for its extreme surface sensitivity, as it yields a quantitative analysis of the outermost surface as well as highly resolved in-depth information for ultrathin surface layers. Hence, it could have been generally considered to be a suitable technique for the analysis of graphene samples. However, due to the low scattering cross section for light elements such as carbon, LEIS has not become a common technique for the characterization of graphene. In the present study we use a high-sensitivity LEIS instrument with parallel energy analysis for the characterization of CVD graphene transferred to thermal silica/silicon substrates. Thanks to its high sensitivity and the exceptional depth resolution typical of LEIS, the graphene layer closure was verified, and different kinds of contaminants were detected, quantified, and localized within the graphene structure. Utilizing the extraordinarily strong neutralization of helium by carbon atoms in graphene, LEIS experiments performed at several primary ion energies permit us to distinguish carbon in graphene from that in nongraphitic forms (e.g., the remains of a resist). Furthermore, metal impurities such as Fe, Sn, and Na located at the graphene–silica interface (intercalated) are detected, and the coverages of Fe and Sn are determined. Hence, high-resolution LEIS is capable of both checking the purity of graphene surfaces and detecting impurities incorporated into graphene layers or their interfaces. Thus, it is a suitable method for monitoring the quality of the whole fabrication process of graphene, including its transfer on various substrates.